feat(rivetkit-core): expose metrics endpoint by NathanFlurry · Pull Request #5013 · rivet-dev/rivet

NathanFlurry · 2026-05-11T12:05:37Z

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

railway-app · 2026-05-11T12:05:46Z

🚅 Deployed to the rivet-pr-5013 environment in rivet-frontend

Service	Status	Web	Updated (UTC)
kitchen-sink	❌ Build Failed (View Logs)	Web	May 14, 2026 at 5:27 pm
website	😴 Sleeping (View Logs)	Web	May 12, 2026 at 2:37 am
mcp-hub	✅ Success (View Logs)	Web	May 11, 2026 at 12:07 pm
frontend-cloud	❌ Build Failed (View Logs)	Web	May 11, 2026 at 12:06 pm
ladle	❌ Build Failed (View Logs)	Web	May 11, 2026 at 12:06 pm
frontend-inspector	❌ Build Failed (View Logs)	Web	May 11, 2026 at 12:06 pm

NathanFlurry · 2026-05-11T12:05:53Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude · 2026-05-11T12:08:14Z

Code Review: feat(rivetkit-core): expose metrics endpoint

Overview

This PR adds a /metrics Prometheus endpoint to rivetkit-core, implemented across two code paths: the registry HTTP dispatcher (registry/http.rs) and the serverless runtime (serverless.rs). Authorization uses constant-time bearer token comparison via subtle::ConstantTimeEq. The endpoint is controlled by two env vars: RIVETKIT_METRICS_ENABLED=1 and RIVETKIT_METRICS_TOKEN.

Issues

1. Dead code in handle_framework_fetch match arm

The early return in handle_fetch before active_actor() means the FrameworkHttpRoute::Metrics arm inside handle_framework_fetch is unreachable:

// handle_fetch: always returns early for Metrics
if matches!(route, RegistryHttpRoute::Framework(FrameworkHttpRoute::Metrics)) {
    return handle_metrics_fetch(&request);
}
let instance = match self.active_actor(actor_id).await { ... };
// ...
// handle_framework_fetch: this arm is dead code
FrameworkHttpRoute::Metrics => handle_metrics_fetch(&request),

The intent (serving metrics without requiring an active actor) is correct, but the unreachable match arm is confusing. Consider removing Metrics from the handle_framework_fetch match and using _ => unreachable!() there (acceptable per CLAUDE.md for explicit asserts), or restructure to remove the early return and add a dedicated pre-actor check in the Err branch instead.

2. Dual env-var design is a footgun

Requiring both RIVETKIT_METRICS_ENABLED=1 AND RIVETKIT_METRICS_TOKEN is awkward. If a user sets the token but forgets the enabled flag, they get a silent 403 with no diagnostic. Presence of a non-empty RIVETKIT_METRICS_TOKEN alone should be sufficient to enable the endpoint — drop RIVETKIT_METRICS_ENABLED.

3. Inconsistent error format for Prometheus render failures (serverless path)

In serverless.rs, when render_prometheus_metrics() fails, the code calls error_response(error) which returns a structured JSON body. Prometheus scrapers expect text/plain — a JSON body causes confusing parse errors. The http.rs path correctly propagates the error with ?. The serverless path should either return a plain-text error body or propagate upward with ? as well.

4. Env vars read on every request

configured_metrics_token() reads env vars on every scrape. This means a token change takes effect without restart, which is unexpected behavior for a security credential. Consider reading once at startup and caching the result, or document this behavior explicitly.

Suggestions

Magic number 6 in bearer_token_from_authorization: use "bearer".len() instead of the bare literal.
Linear scan in authorization_bearer_token_map: headers.iter().find(|(name, _)| name.eq_ignore_ascii_case(...)) is O(n). If callers normalize header keys to lowercase, .get("authorization") is O(1).
Missing WWW-Authenticate header on 401: RFC 9110 §11.3 requires it. Consider WWW-Authenticate: Bearer realm="metrics".
403 vs 404 for disabled endpoint: Returning 403 "metrics not enabled" reveals the endpoint exists but is disabled. 404 would be less informative to scanners. Either is defensible — worth a deliberate choice.

Positives

Constant-time token comparison via subtle::ConstantTimeEq is the right security call.
Method validation (GET-only) happens before auth, returning 405 cleanly.
Early return before actor resolution correctly makes metrics available when no actor is active.
Module visibility (pub(crate)) is appropriate.
Bearer token parsing handles edge cases well (empty token, case-insensitive scheme, leading/extra whitespace).

Missing

No tests added. The bearer token parser (bearer_token_from_authorization) and authorization logic have enough edge cases (empty token, case variants, whitespace) to warrant unit tests in tests/ per the CLAUDE.md Rust test layout rule. The handler paths in both http.rs and serverless.rs should also have integration-level test coverage.

railway-app Bot had a problem deploying to rivet-frontend / rivet-pr-5013 May 11, 2026 12:05 Failure

This was referenced May 11, 2026

feat(rivetkit): expose low-cardinality metrics routes #5019

Draft

feat(rivetkit): add current actor metrics #5020

Draft

fix(rivetkit): require engine ping for health #5022

Draft

abcxff mentioned this pull request May 11, 2026

temp: patch wasm-pack to point to latest binary #5037

Draft

11 tasks

MasterPtato mentioned this pull request May 12, 2026

fix(envoy-client): dont abort drain after 20s #5043

Draft

11 tasks

feat(rivetkit-core): expose metrics endpoint

f21c5de

MasterPtato force-pushed the 05-09-wip_counter_latency_monitor branch from dc95591 to f602228 Compare May 14, 2026 17:26

MasterPtato force-pushed the counter-latency/core-metrics-endpoint branch from 87b147e to f21c5de Compare May 14, 2026 17:26

railway-app Bot had a problem deploying to rivet-frontend / rivet-pr-5013 May 14, 2026 17:26 Failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(rivetkit-core): expose metrics endpoint#5013

feat(rivetkit-core): expose metrics endpoint#5013
NathanFlurry wants to merge 1 commit into
05-09-wip_counter_latency_monitorfrom
counter-latency/core-metrics-endpoint

NathanFlurry commented May 11, 2026

Uh oh!

railway-app Bot commented May 11, 2026 •

edited

Loading

Uh oh!

NathanFlurry commented May 11, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 11, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

NathanFlurry commented May 11, 2026

Description

Type of change

How Has This Been Tested?

Checklist:

Uh oh!

railway-app Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NathanFlurry commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: feat(rivetkit-core): expose metrics endpoint

Overview

Issues

Suggestions

Positives

Missing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

railway-app Bot commented May 11, 2026 •

edited

Loading

NathanFlurry commented May 11, 2026 •

edited

Loading

claude Bot commented May 11, 2026 •

edited

Loading